MEIT: Memory Efficient Itemset Tree for Targeted Association Rule Mining

Authors

  • Philippe Fournier-Viger
  • Espérance Mwamikazi
  • Ted Gueniche
  • Usef Faghihi
Abstract

The Itemset Tree (IT) is an efficient data structure for performing targeted queries for itemset mining and association rule mining. It can be updated incrementally by inserting new transactions, and it provides efficient querying and updating algorithms. However, an important scalability limitation of the IT structure is that it consumes a large amount of memory. In this paper, we address this limitation by proposing an improved data structure named MEIT (Memory Efficient Itemset Tree). It offers an efficient node compression mechanism for reducing IT node size, and it performs on-the-fly node decompression to restore compressed information when needed. An experimental study on datasets commonly used in the data mining literature, representing various types of data, shows that MEIT trees are up to 60% smaller than IT trees (43% on...

Similar resources

Implementation of Efficient Algorithm for Mining High Utility Itemsets in Distributed and Dynamic Database

Association Rule Mining (ARM) is finding out the frequent itemsets or patterns among the existing items from the given database. High Utility Pattern Mining has become the recent research with respect to data mining. The proposed work is High Utility Pattern for distributed and dynamic database. The traditional method of mining frequent itemset mining embrace that the data is astride and sedent...

An Accelerator for Frequent Itemset Mining from Data Streams with Parallel Item Tree

Frequent itemset mining attempts to find frequent subsets in a transaction database. In this era of big data, demand for frequent itemset mining is increasing. Therefore, the combination of fast implementation and low memory consumption, especially for stream data, is needed. In response to this, we optimize an online algorithm, called Skip LC-SS algorithm [1], for hardware. In this paper, we p...

Comparative Study of Frequent Itemset Mining Algorithms: FP growth, FIN, Prepost + and study of Efficiency in terms of Memory Consumption, Scalability and Runtime

Data mining represents the process of extracting interesting and previously unknown knowledge (patterns) from data. Frequent pattern mining has become an important data mining technique and has been a focused area in research field. Frequent patterns are patterns that appear in a data set most frequently. Various methods have been proposed to improve the performance of frequent pattern mining a...

A Survey on Efficient Incremental Algorithm for Mining High Utility Itemsets in Distributed and Dynamic Database

Data Mining is the process of analyzing data from different perspectives and summarizing it into useful information. It can be defined as the activity that extracts information contained in very large database. That information can be used to increase the revenue or cut costs. Association Rule Mining (ARM) is finding out the frequent itemsets or patterns among the existing items from the given ...

A Framework for Efficient Association Rule Mining in XML Data

In this paper, we propose a framework, called XAR-Miner, for mining ARs from XML documents efficiently. In XAR-Miner, raw data in the XML document are first preprocessed to transform to either an Indexed XML Tree (IX-tree) or Multi-relational Databases (Multi-DB), depending on the size of XML document and memory constraint of the system, for efficient data selection and AR mining. Concepts that...

Publication date: 2013